The object of this exercise is to find the UUIDs for processes dealing with sugar beet production.
The plan is:
In [1]:
import json
import re
In [2]:
with open('../catalogs/json/ecoinvent_3.2_undefined_xlsx.json') as fp:
    ei32 = json.load(fp)
In [3]:
def search_tags(entity, search):
    """
    Search all the 'tags' (semantic content) of a data set and return True
    if the search expression is found. The match is case-insensitive.
    """
    all_tags = '; '.join([str(x) for x in entity['tags'].values()])
    return bool(re.search(search, all_tags, flags=re.IGNORECASE))
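As a quick check of how the search behaves, here is a minimal sketch run against a made-up record. The `toy_flow` dictionary and its values are hypothetical; only its shape (a 'tags' dictionary of strings) follows the catalog entries used in this notebook.

```python
import re

def search_tags(entity, search):
    # Join every tag value into one string, then run a case-insensitive regex
    all_tags = '; '.join(str(x) for x in entity['tags'].values())
    return bool(re.search(search, all_tags, flags=re.IGNORECASE))

# Hypothetical record, shaped like the catalog entries (not real ecoinvent data)
toy_flow = {
    'dataSetReference': 'flow-0001',
    'tags': {'Name': 'Sugar beet', 'Comment': 'harvested root crop'},
}

found_beet = search_tags(toy_flow, 'BEET')             # case is ignored
found_cane = search_tags(toy_flow, 'cane')             # no tag mentions cane
found_regex = search_tags(toy_flow, r'sugar\s+beet')   # the argument is a regex
print(found_beet, found_cane, found_regex)
```

Because the search argument is treated as a regular expression, characters such as '.' or '(' in a search term would need escaping with re.escape.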
In [4]:
beets = [flow for flow in ei32['flows'] if search_tags(flow,'beet')]
In [5]:
len(beets)
Out[5]:
11 beet-related flows
In [6]:
[b['tags']['Name'] for b in beets]
Out[6]:
In [7]:
beet_processes = [x['process'] for x in ei32['exchanges'] if x['flow'] in beets]
In [8]:
len(beet_processes)
Out[8]:
That didn't work. The reason is that 'beets' is a list of flow objects, but exchange
entries only hold references to flow objects, so the membership test never matches.
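The mismatch can be reproduced on toy data. This is only a sketch: the records below are invented, and it assumes, as described above, that each exchange stores the flow's 'dataSetReference' string rather than the flow record itself.

```python
# Hypothetical flow records and exchanges (not real ecoinvent data)
toy_flows = [
    {'dataSetReference': 'flow-a', 'tags': {'Name': 'sugar beet'}},
    {'dataSetReference': 'flow-b', 'tags': {'Name': 'sugar beet pulp'}},
]
toy_exchanges = [
    {'process': 'proc-1', 'flow': 'flow-a'},
    {'process': 'proc-2', 'flow': 'flow-b'},
    {'process': 'proc-3', 'flow': 'flow-z'},
]

# Comparing a reference string against a list of dicts never matches
by_object = [x['process'] for x in toy_exchanges if x['flow'] in toy_flows]

# Comparing against the set of reference strings does
toy_refs = {f['dataSetReference'] for f in toy_flows}
by_reference = [x['process'] for x in toy_exchanges if x['flow'] in toy_refs]

print(by_object)
print(by_reference)
```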
In [9]:
beet_refs = [b['dataSetReference'] for b in beets]
In [10]:
beet_processes = [x['process'] for x in ei32['exchanges'] if x['flow'] in beet_refs]
In [11]:
len(beet_processes)
Out[11]:
This list can include duplicates, because a process appears once for every matching exchange.
Converting it to a set forces the entries to be unique (set members).
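A small illustration of the deduplication on invented process references follows. The `dict.fromkeys` variant is an optional alternative, not something this notebook uses; it keeps the first-seen order, which a set does not guarantee.

```python
# Hypothetical process references with repeats
toy_processes = ['proc-1', 'proc-2', 'proc-1', 'proc-3', 'proc-2']

unique_unordered = set(toy_processes)                 # unique, order not guaranteed
unique_ordered = list(dict.fromkeys(toy_processes))   # unique, first-seen order kept

print(len(toy_processes), len(unique_unordered))
print(unique_ordered)
```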
In [12]:
len(set(beet_processes))
Out[12]:
In [13]:
[p for p in ei32['processes'] if p['dataSetReference'] == beet_processes[0]]
Out[13]:
Use pandas to print the processes as a table.
In [14]:
import pandas as pd
In [15]:
beet_process_set = set(beet_processes)  # build the set once, not once per record
p_list = [p for p in ei32['processes'] if p['dataSetReference'] in beet_process_set]
p_list is a list of full process records. We want to view all the semantic content of these records, that is, all the tags.
In [16]:
P = pd.DataFrame([p['tags'] for p in p_list],
                 index=[p['dataSetReference'] for p in p_list])
In [17]:
P
Out[17]:
Write this info to a CSV file in the current directory
In [18]:
P.to_csv('beet_processes.csv')
The UUIDs in the index can be used to identify the ecospold XML files to retrieve for further information (including input requirements and exchange values).
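One way to do that lookup is sketched below. It rests on assumptions not confirmed in this notebook: that the ecospold files sit in a single local directory and that each file name contains the UUID of its process. The helper `find_files_for_uuids` and the placeholder identifiers are hypothetical.

```python
import tempfile
from pathlib import Path

def find_files_for_uuids(directory, uuids):
    """Map each UUID to the names of files in `directory` that contain it.

    Assumption: the UUID appears somewhere in the file name.
    """
    matches = {u: [] for u in uuids}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for u in uuids:
            if u.lower() in path.name.lower():
                matches[u].append(path.name)
    return matches

# Demonstration on a throwaway directory with placeholder names (not real UUIDs)
with tempfile.TemporaryDirectory() as tmp:
    for name in ('aaaa-1111_bbbb-2222.spold', 'cccc-3333_dddd-4444.spold'):
        (Path(tmp) / name).touch()
    demo = find_files_for_uuids(tmp, ['aaaa-1111', 'eeee-5555'])

print(demo)
```

In this notebook the UUID list would come from P.index; a UUID that maps to an empty list has no matching file in the directory.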